E2E MULTI/EXEC/WATCH Support #619
base: master
Conversation
Codecov Report
@@ Coverage Diff @@
## master #619 +/- ##
==========================================
+ Coverage 59.65% 59.91% +0.26%
==========================================
Files 44 44
Lines 15533 15584 +51
Branches 1830 1844 +14
==========================================
+ Hits 9266 9337 +71
+ Misses 6267 6247 -20
if (req) {
    RedisModule_ReplyWithNull(req->ctx);
}
endClientSession(rr, client_session->client_id);
We should unwatch all keys for this client.
See comment below, unsure it's necessary.
Did you take a look at the Redis EXEC implementation? Please make sure that we remove the client from the watched clients list somewhere in our use case.
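One way to guarantee this, sketched below under the assumption that the module client can be attached to a context the way it is elsewhere in this PR, is to issue UNWATCH on the session's behalf before tearing it down, mirroring what Redis does when a real client disconnects. The helper name is hypothetical and not part of the diff.

/* Hypothetical helper (not in this PR): drop the session's module client
 * from Redis' watched-keys lists before the session is torn down, by
 * issuing UNWATCH on its behalf. */
static void clientSessionUnwatchAll(RedisRaftCtx *rr, ClientSession *client_session)
{
    RedisModuleCtx *ctx = rr->ctx;

    /* Attach the persistent module client so UNWATCH applies to it. */
    RedisModule_SetContextClient(ctx, client_session->client);

    RedisModuleCallReply *reply = RedisModule_Call(ctx, "UNWATCH", "");
    if (reply) {
        RedisModule_FreeCallReply(reply);
    }

    /* Detach again, mirroring the reset done after other RM_Call()s. */
    RedisModule_SetContextClient(ctx, NULL);
}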
src/raft.c
Outdated
@@ -412,6 +428,17 @@ RedisModuleCallReply *RaftExecuteCommandArray(RedisRaftCtx *rr,
     * (although no harm is done).
     */
    if (i == 0 && cmdlen == 5 && !strncasecmp(cmd, "MULTI", 5)) {
        if (client_session) {
            uint64_t flags = RedisModule_GetClientFlags(client_session->client);
            if (flags) {
We should check for a specific flag
I don't remember if we discussed this, but why don't we pass MULTI/EXEC to RM_Call() so it can return the proper reply depending on the client's watched keys? Do we have to manually check the dirty result?
It's possible. Currently we don't add "EXEC" to the log entry, so that would be a change there; I was trying to minimize the changes needed for now.
I was thinking that if we pass WATCH, MULTI and EXEC to RM_Call(), then it will minimize the API and be quite similar to real clients. Then, we don't have to deal with checking the flag before RM_Call(), and there is no need to generate the reply manually.
Also, I assume RM_Call() will reset the client after EXEC, so we don't have to destroy the moduleclient. Currently, the existence of the moduleclient is tied to watch/multi/exec; I think it does not have to be like that. We may even maintain a moduleclient per client even when they are not watching any keys.
No, I don't think we can rely on the client being reset after EXEC. Client state persists across EXEC (just not necc for our session usage).
It might be reasonable to maintain a module client per connected client, but there might be side effects that we don't understand yet.
I strongly don't want to make this change now, as I think it underestimates the effort and the logical change we would have to make to the code.
Currently, every time (minus blocking, but that returns a different CallReply type) we do an RM_Call(), whether within our simulated MULTI or for single commands, it returns a result that we care about.
If we use the persistent client to enqueue, we would get multiple QUEUED responses that we don't care about, and only get a proper response at the very end when we do EXEC.
While this might be an improvement (not convinced, but willing to accept it as true), it's a fundamental change to a crucial part of the code.
No, I don't think we can rely on the client being reset after EXEC. Client state persists across EXEC (just not necc for our session usage).
Not sure I understand, but I meant the dirty flag will be cleaned up automatically, so the moduleclient might be reusable. Right now, IIUC, you have to free the moduleclient once the dirty flag is set.
Btw what does necc mean? :) Is it shorthand for necessary?
Ok, indeed the QUEUED reply is a bit annoying. Overall, I still see the moduleclient as a replacement for a real client, so it can carry some state as if it were a real connection. Maybe we can consider this approach in the future if we need something other than WATCH. If I'm not mistaken, most changes will be in redisraft.
Yes, sorry, necc was short for necessary (which I see doesn't make sense with the extra C; I should have used nec).
I don't disagree that it's possible for the future (though in practice it will just be slower, but perhaps more accurate), as we aren't going to be appending the queued entries individually to the log, so we will still queue on the leader.
done
ClientSession *client_session = NULL;
RedisModule_DictDelC(rr->client_session_dict, &id, sizeof(id), &client_session);
if (client_session) {
    freeClientSession(rr, client_session);
maybe we should call unwatch all?
It shouldn't do anything; it should be no different than calling WATCH in a normal RM_Call(). We are releasing/resetting the client back to the temp pool.
src/raft.c
Outdated
} else {
    RedisModule_SetContextUser(ctx, NULL);
}
RedisModule_SetContextClient(ctx, NULL);
why do we need to do it twice?
What twice? In one case we set the ctx's user to NULL, in the other case we set the ctx's client to NULL.
In the client_session == True case, we already set the ctx's client => if we set the ctx's client to NULL anyway, there's no need to set it first in the client_session case.
Yes, we could check it / wrap it in more if/else conditions, but always setting it to NULL afterwards, even if it was set up front, seems cleaner.
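For readers following the thread, here is a condensed sketch of the pattern being discussed, assuming the module-client API from redis/redis#12159; the function and argument names are illustrative and not taken from the PR. The session client (or the ACL user) is attached to the context only around the RM_Call(), and both are detached unconditionally afterwards, regardless of which branch attached something.

/* Sketch only, not the PR's code. */
static RedisModuleCallReply *callWithSessionContext(RedisModuleCtx *ctx,
                                                    ClientSession *client_session,
                                                    RedisModuleUser *user,
                                                    const char *cmd,
                                                    const char *resp_call_fmt,
                                                    RedisModuleString **argv,
                                                    size_t argc)
{
    if (client_session) {
        /* Run the command as the session's persistent module client. */
        RedisModule_SetContextClient(ctx, client_session->client);
    } else {
        /* No session: fall back to the ACL user, which may be NULL. */
        RedisModule_SetContextUser(ctx, user);
    }

    RedisModuleCallReply *reply = RedisModule_Call(ctx, cmd, resp_call_fmt, argv, argc);

    /* Always detach both afterwards, even if only one was attached above;
     * simpler to reason about than mirroring the if/else a second time. */
    RedisModule_SetContextUser(ctx, NULL);
    RedisModule_SetContextClient(ctx, NULL);

    return reply;
}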
@@ -376,6 +392,7 @@ RedisModuleCallReply *RaftExecuteCommandArray(RedisRaftCtx *rr,
    RedisModuleCallReply *reply = NULL;
    RedisModuleUser *user = NULL;
    RedisModuleCtx *ctx = req ? req->ctx : rr->ctx;
    bool is_multi_session = false;
The name is a little bit confusing - it's not a multi session, but a multi session with a valid persistent client.
I guess in my head this is a multi using session support, not a multi without session support. I.e. is_multi would be just a multi; is_multi_session is a multi when we are handling a session.
@@ -574,6 +574,7 @@ static void clientSessionRDBLoad(RedisModuleIO *rdb)
    unsigned long long id = RedisModule_LoadUnsigned(rdb);
    client_session->client_id = id;
    client_session->local = false;
    client_session->client = RedisModule_CreateModuleClient(rr->ctx);
What about rdbSave? Don't we want to save the client state too? (dirty state, for example)
done now
If missing, maybe we can add a test for snapshot & snapshot delivery. Just to hit those lines where we save/restore sessions
@@ -359,6 +376,16 @@ void handleUnblock(RedisModuleCtx *ctx, RedisModuleCallReply *reply, void *priva
    freeBlockedCommand(bc);
}

static bool isClientSessionDirty(ClientSession *client_session)
{
    uint64_t flags = RedisModule_GetClientFlags(client_session->client);
We should check for a specific dirty flag
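Presumably something along these lines is what's meant; the flag constant below is a placeholder, since the real name would come from the module-client API in redis/redis#12159.

static bool isClientSessionDirty(ClientSession *client_session)
{
    uint64_t flags = RedisModule_GetClientFlags(client_session->client);

    /* Test only the CAS-dirty bit instead of treating any non-zero flags
     * value as dirty. REDISMODULE_CLIENT_FLAG_DIRTY_CAS is a placeholder
     * for whatever constant redis/redis#12159 actually exposes. */
    return (flags & REDISMODULE_CLIENT_FLAG_DIRTY_CAS) != 0;
}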
src/snapshot.c
Outdated
@@ -707,6 +709,11 @@ static void clientSessionRDBSave(RedisModuleIO *rdb)
    ClientSession *client_session;
    while (RedisModule_DictNextC(iter, NULL, (void **) &client_session) != NULL) {
        RedisModule_SaveUnsigned(rdb, client_session->client_id);
        if (RedisModule_GetClientFlags(client_session->client)) {
see my comment for isClientSessionDirty()
done
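For illustration, the save/restore pairing could look roughly like the sketch below; the setter and the flag constant are assumptions layered on redis/redis#12159, not confirmed API. The idea is to persist one dirty bit per session and re-apply it when the session is rebuilt from the RDB, so the dirtiness check at EXEC apply time still fails after a snapshot restore.

/* Sketch: persist a single "dirty" bit per session... */
static void clientSessionSaveDirty(RedisModuleIO *rdb, ClientSession *client_session)
{
    uint64_t flags = RedisModule_GetClientFlags(client_session->client);
    RedisModule_SaveUnsigned(rdb, flags ? 1 : 0);
}

/* ...and re-mark the freshly created module client as dirty on load.
 * RedisModule_SetClientFlags and the flag constant are hypothetical; the
 * real PR may expose a different way to mark the client dirty. */
static void clientSessionLoadDirty(RedisModuleIO *rdb, ClientSession *client_session)
{
    unsigned long long dirty = RedisModule_LoadUnsigned(rdb);
    if (dirty) {
        RedisModule_SetClientFlags(client_session->client,
                                   REDISMODULE_CLIENT_FLAG_DIRTY_CAS);
    }
}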
Note - we don't handle DISCARD correctly yet (tearing down the session).
src/raft.c
Outdated
@@ -434,17 +471,27 @@ RedisModuleCallReply *RaftExecuteCommandArray(RedisRaftCtx *rr,
     * When we have an ACL, we will have a user set on the context, so need "C"
     */
    char *resp_call_fmt;
-   if (cmds->cmd_flags & CMD_SPEC_MULTI) {
+   if (cmds->cmd_flags & CMD_SPEC_MULTI || client_session) {
Don't know what Redis currently does, but can't we send WATCH and then BLPOP (without MULTI)?
Btw, a test case might be good for this depending on the answer.
Needs to be fixed.
Leverages redis/redis#12159 and redis/redis#12219 to enable end-to-end MULTI/EXEC/WATCH support.
redis/redis#12159 - Enables RedisRaft to fully implement session support where watches can be tracked and validated for dirtiness. On snapshot we save all current session dirtiness, and on restore we include that state in the dirtiness check at EXEC apply time.
redis/redis#12219 - Enables RedisRaft to have finer-grained control over command interception. This lets us handle "PING" and "SHUTDOWN" within a MULTI correctly. Normally, we don't want to intercept them, but if our client is in a MULTI state, we do (to prevent SHUTDOWN from working, as in normal Redis, and to have PING's "PONG" response be part of the MULTI array response).
This PR enables most of the MULTI/EXEC/WATCH tests.
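A rough sketch of the PING/SHUTDOWN interception rule described above, with illustrative function and parameter names that are not taken from the PR:

#include <stdbool.h>
#include <stddef.h>
#include <strings.h>

/* PING and SHUTDOWN are normally left for Redis to handle, but once the
 * calling client is inside a MULTI we take them over: SHUTDOWN must not
 * fire, and PING's +PONG has to show up inside the EXEC reply array. */
static bool shouldInterceptPingOrShutdown(const char *cmd, size_t cmdlen, bool in_multi)
{
    bool is_ping = (cmdlen == 4 && !strncasecmp(cmd, "PING", 4));
    bool is_shutdown = (cmdlen == 8 && !strncasecmp(cmd, "SHUTDOWN", 8));

    return (is_ping || is_shutdown) && in_multi;
}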